Floating Forests: Quantitative Validation of Citizen Science Data Generated From Consensus Classifications

Authors

  • Isaac S. Rosenthal
  • Jarrett E.K. Byrnes
  • Kyle C. Cavanaugh
  • Tom W. Bell
  • Briana Harder
  • Alison J. Haupt
  • Andrew T.W. Rassweiler
  • Alejandro Pérez-Matus
  • Jorge Assis
  • Ali Swanson
  • Amy Boyer
  • Adam McMaster
  • Laura Trouille
Abstract

Large-scale research endeavors can be hindered by logistical constraints that limit the amount of available data. For example, global ecological questions require a global dataset, and traditional sampling protocols are often too inefficient for a small research team to collect an adequate amount of data. Citizen science offers an alternative by crowdsourcing data collection. Despite its growing popularity, the community has been slow to embrace it, largely due to concerns about the quality of data collected by citizen scientists. Using the citizen science project Floating Forests (http://floatingforests.org), we show that consensus classifications made by citizen scientists produce data of comparable quality to expert-generated classifications. Floating Forests is a web-based project in which citizen scientists view satellite photographs of coastlines and trace the borders of kelp patches. Since its launch in 2014, over 7,000 citizen scientists have classified over 750,000 images of kelp forests, largely in California and Tasmania. Each image is classified by 15 users. We generated consensus classifications by overlaying all citizen classifications and assessed accuracy by comparing them to expert classifications. The Matthews correlation coefficient (MCC) was calculated for each threshold (1-15), and the threshold with the highest MCC was considered optimal. We showed that the optimal user threshold was 4.2 with an MCC of 0.400 (0.023 SE) for Landsats 5 and 7, and an MCC of 0.639 (0.246 SE) for Landsat 8. These results suggest that citizen science data derived from consensus classifications are of comparable accuracy to expert classifications. Citizen science projects should implement methods such as consensus classification, in conjunction with a quantitative comparison to expert-generated classifications, to avoid concerns about data quality.
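
The threshold selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes each classification is a flattened binary mask (1 = kelp, 0 = not kelp), that a pixel counts as kelp in the consensus when at least `threshold` users marked it, and that the function names (`consensus_mask`, `mcc`, `best_threshold`) are hypothetical. The MCC formula itself is the standard one, (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

```python
from math import sqrt


def consensus_mask(user_masks, threshold):
    """Mark a pixel as kelp if at least `threshold` users marked it.

    Assumption: masks are equal-length lists of 0/1 values, one per user.
    """
    n_pixels = len(user_masks[0])
    votes = [sum(mask[i] for mask in user_masks) for i in range(n_pixels)]
    return [1 if v >= threshold else 0 for v in votes]


def mcc(predicted, expert):
    """Matthews correlation coefficient between two binary masks."""
    tp = sum(1 for p, e in zip(predicted, expert) if p == 1 and e == 1)
    tn = sum(1 for p, e in zip(predicted, expert) if p == 0 and e == 0)
    fp = sum(1 for p, e in zip(predicted, expert) if p == 1 and e == 0)
    fn = sum(1 for p, e in zip(predicted, expert) if p == 0 and e == 1)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_threshold(user_masks, expert):
    """Score every threshold from 1 to the number of users; return the best."""
    scores = {
        t: mcc(consensus_mask(user_masks, t), expert)
        for t in range(1, len(user_masks) + 1)
    }
    best = max(scores, key=scores.get)  # ties resolve to the lowest threshold
    return best, scores


if __name__ == "__main__":
    # Toy data: 3 users, 4 pixels (the project itself uses 15 users per image).
    users = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
    expert = [1, 1, 0, 0]
    print(best_threshold(users, expert))
```

In this toy case a low threshold admits a false positive and a high threshold drops a true kelp pixel, so an intermediate threshold scores best. Repeating the selection per image and averaging the chosen thresholds is one way a non-integer value could arise, though the abstract does not say how its reported optimum was aggregated.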


Similar resources

Citizen science networks in natural history and the collective validation of biodiversity data.

Biodiversity data are in increasing demand to inform policy and management. A substantial portion of these data is generated in citizen science networks. To ensure the quality of biodiversity data, standards and criteria for validation have been put in place. We used interviews and document analysis from the United Kingdom and The Netherlands to examine how data validation serves as a point of ...


The Role of Citizen Science in Earth Observation

Citizen Science (CS) and crowdsourcing are two potentially valuable sources of data for Earth Observation (EO), which have yet to be fully exploited. Research in this area has increased rapidly during the last two decades, and there are now many examples of CS projects that could provide valuable calibration and validation data for EO, yet are not integrated into operational monitoring systems....


Distribution models for koalas in South Australia using citizen science-collected data

The koala (Phascolarctos cinereus) occurs in the eucalypt forests of eastern and southern Australia and is currently threatened by habitat fragmentation, climate change, sexually transmitted diseases, and low genetic variability throughout most of its range. Using data collected during the Great Koala Count (a 1-day citizen science project in the state of South Australia), we developed generali...


Machine Learning and Citizen Science: Opportunities and Challenges of Human-Computer Interaction

Background and Aim: When processing large datasets, scientists face the tedious task of analyzing a hefty bulk of data. Machine learning techniques are a potential solution to this problem. In citizen science, human and artificial intelligence may be unified to facilitate this effort. Considering the ambiguities in machine performance and management of user-generated data, this paper aims to...


Community-as-a-Service: Data Validation in Citizen Science

Currently, most citizen science projects that adopt a crowdsourcing model focus primarily on collecting and analyzing data. As yet, few of them leverage community interactions for effective data validation, which would have a significant impact on improving the quality of the increasing volume of citizen science data. In this paper, we introduce an exploratory pilot study focused on understan...




Publication date: 2018